Promote scoped RAG safety updates to main #79
Manual comprehensive RAG reloads now exceed the proxy response window because embedding calls are throttled to stay under Gemini quota. Return a job snapshot immediately and let a bounded single-worker executor run the reload in the background, with status endpoints for polling.

Constraint: Cloudflare can return 524 when synchronous origin responses exceed the proxy window
Constraint: Gemini quota requires throttled embedding calls during full reloads
Rejected: Remove throttling | would reintroduce 429 and failed user-data embeddings
Confidence: high
Scope-risk: narrow
Tested: git diff --check
Tested: ./gradlew test --tests com.fairing.fairplay.ai.rag.service.RagLoadJobServiceTest --no-daemon
Tested: ./gradlew test --no-daemon
Co-authored-by: OmX <omx@oh-my-codex.dev>

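A minimal sketch of the "return a snapshot immediately, run in the background" pattern described above. All names here (`ReloadJobSketch`, `Snapshot`, `submit`, `status`, `shutdownAndWait`) are hypothetical and are not taken from the actual `RagLoadJobService`; the queue capacity of 1 is likewise an assumption.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not the project's RagLoadJobService.
class ReloadJobSketch {
    enum Status { QUEUED, RUNNING, SUCCEEDED, FAILED }

    record Snapshot(String jobId, Status status, String error) {}

    private final Map<String, Snapshot> jobs = new ConcurrentHashMap<>();

    // One worker thread and a bounded queue (capacity is an assumption):
    // extra submissions are rejected instead of piling up.
    private final ExecutorService executor = new ThreadPoolExecutor(
            1, 1, 0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<>(1));

    // Returns right away; the caller polls status(jobId) for progress.
    Snapshot submit(Runnable reload) {
        String id = UUID.randomUUID().toString();
        Snapshot queued = new Snapshot(id, Status.QUEUED, null);
        jobs.put(id, queued);
        try {
            executor.execute(() -> {
                jobs.put(id, new Snapshot(id, Status.RUNNING, null));
                try {
                    reload.run();
                    jobs.put(id, new Snapshot(id, Status.SUCCEEDED, null));
                } catch (RuntimeException e) {
                    jobs.put(id, new Snapshot(id, Status.FAILED, e.getMessage()));
                }
            });
        } catch (RejectedExecutionException e) {
            Snapshot rejected = new Snapshot(id, Status.FAILED, "reload queue is full");
            jobs.put(id, rejected);
            return rejected;
        }
        return queued;
    }

    Snapshot status(String jobId) {
        return jobs.get(jobId);
    }

    // Helper for tests and shutdown hooks: stop accepting work and wait for the worker.
    boolean shutdownAndWait(long timeoutMs) {
        executor.shutdown();
        try {
            return executor.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Under this sketch, a controller would return the snapshot from `submit` in the HTTP response without waiting for the reload, and a separate status endpoint would call `status(jobId)`, so the proxy window no longer depends on how long the throttled embedding calls take.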
Reservations were embedded inside broad user documents, so private booking questions had weak retrieval boundaries and event questions could be polluted by booth/user chunks. This splits reservation history into owner-scoped reservation documents, keeps public event documents first for event-info questions, and carries scope metadata into pgvector rows.

Constraint: Personal reservation answers must only search the authenticated user's own private documents
Constraint: Event questions need event documents to rank ahead of booth documents that repeat event names
Rejected: Keep reservation snippets inside user_* documents | mixed private documents made per-reservation refresh and retrieval precision too coarse
Confidence: high
Scope-risk: moderate
Directive: Do not add new RAG document types without setting doc_type, visibility, and owner/event metadata on every chunk
Tested: ./gradlew test --no-daemon
Co-authored-by: OmX <omx@oh-my-codex.dev>

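The scoping rule above can be illustrated with an in-memory sketch. The `doc_type`, `visibility`, and owner/event fields come from the directive; the class name, the `PUBLIC`/`PRIVATE` values, the `event` type string, and the `visibleTo` method are assumptions made for illustration, and the real filtering presumably happens in the pgvector query rather than in Java.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

// Hypothetical sketch of owner-scoped retrieval; not the project's actual classes.
class ScopedRetrievalSketch {

    record Chunk(String docType, String visibility, Long ownerId, Long eventId, String text) {
        Chunk {
            // Every chunk must carry scope metadata, mirroring the directive.
            Objects.requireNonNull(docType, "doc_type is required");
            Objects.requireNonNull(visibility, "visibility is required");
            if ("PRIVATE".equals(visibility) && ownerId == null) {
                throw new IllegalArgumentException("private chunks need an owner");
            }
        }
    }

    // Keeps public chunks plus the caller's own private chunks,
    // then moves event documents ahead of everything else (stable sort).
    static List<Chunk> visibleTo(Long userId, List<Chunk> candidates) {
        return candidates.stream()
                .filter(c -> "PUBLIC".equals(c.visibility())
                        || (userId != null && userId.equals(c.ownerId())))
                .sorted(Comparator.comparingInt(
                        (Chunk c) -> "event".equals(c.docType()) ? 0 : 1))
                .collect(Collectors.toList());
    }
}
```

Rejecting a private chunk without an owner at construction time is one way to enforce the "metadata on every chunk" directive early; whether the project does this at write time or query time is not stated in the commit message.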
The chatbot previously allowed meta-requests to flow into the model, so users could ask for hidden prompts or server resource analysis and receive fabricated or sensitive-looking answers. Add a pre-LLM safety boundary for prompt/system/server-secret requests and reinforce both RAG and fallback system prompts.

Constraint: Chatbot answers must stay within public FairPlay event data and the authenticated user's own reservation data
Rejected: Prompt-only hardening | direct blocking prevents risky requests from reaching search or LLM paths
Confidence: high
Scope-risk: narrow
Directive: Keep safety-boundary requests out of LLM calls; do not rely only on wording inside the system prompt
Tested: ./gradlew test --tests com.fairing.fairplay.ai.rag.service.RagChatServiceScopeTest --no-daemon
Tested: ./gradlew test --no-daemon
Co-authored-by: OmX <omx@oh-my-codex.dev>

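As a rough illustration of a pre-LLM boundary, the sketch below short-circuits before any search or model call. The blocked terms, the refusal text, and every identifier are assumptions for illustration; the actual matching rules in `RagChatService` are not shown in this PR text.

```java
import java.util.List;
import java.util.Locale;
import java.util.function.Function;

// Hypothetical sketch of an input guard; the term list is an assumption.
class InputGuardSketch {
    static final String REFUSAL =
            "I can only help with FairPlay events and your own reservations.";

    private static final List<String> BLOCKED_TERMS = List.of(
            "system prompt", "hidden prompt", "environment variable",
            "api key", "server resource");

    static boolean isSafetyBoundaryRequest(String question) {
        String normalized = question.toLowerCase(Locale.ROOT);
        return BLOCKED_TERMS.stream().anyMatch(normalized::contains);
    }

    // The llm function stands in for the search + model path.
    static String answer(String question, Function<String, String> llm) {
        if (isSafetyBoundaryRequest(question)) {
            return REFUSAL; // blocked requests never reach search or the LLM
        }
        return llm.apply(question);
    }
}
```

The point of the early return is the one stated in the Rejected trailer: the risky request is stopped in code rather than left to the wording of the system prompt.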
The local model should still answer FairPlay-domain help questions when a precise RAG match is unavailable, but it must never leak server resources, prompts, environment values, or secrets. Keep direct safety-boundary input blocking, allow constrained FairPlay-only fallback prompts, and add output inspection that replaces sensitive-looking responses before they reach users.

Constraint: Non-FairPlay questions remain out of scope
Constraint: Server resources and secrets must not leave the AI chat path even when produced by the local model
Rejected: Refuse every no-context question | too conservative for normal FairPlay usage guidance
Rejected: Prompt-only protection | unsafe model output needs a final response gate
Confidence: high
Scope-risk: narrow
Directive: Preserve both input and output guards when changing chat fallback behavior
Tested: ./gradlew test --tests com.fairing.fairplay.ai.rag.service.RagChatServiceScopeTest --no-daemon
Tested: ./gradlew test --no-daemon
Co-authored-by: OmX <omx@oh-my-codex.dev>

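The output inspection step might look something like the sketch below, which replaces a response when it matches a sensitive-looking pattern. The regular expressions, the replacement message, and the class and method names are illustrative assumptions only; the project's real detection rules may differ.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch of a final response gate; patterns are assumptions.
class OutputGuardSketch {
    static final String SAFE_REPLACEMENT =
            "Sorry, I can't share that. Ask me about FairPlay events or your reservations.";

    private static final List<Pattern> SENSITIVE = List.of(
            // key/secret style assignments such as "api_key: abc"
            Pattern.compile("(?i)\\b(api[_-]?key|secret|password|token)\\b\\s*[:=]"),
            // mentions of the prompt itself
            Pattern.compile("(?i)system prompt"),
            // environment-variable style assignments such as "DB_PASSWORD=value"
            Pattern.compile("\\b[A-Z][A-Z0-9_]{2,}=\\S+"));

    // Returns the model output unchanged unless it looks sensitive.
    static String inspect(String modelOutput) {
        for (Pattern pattern : SENSITIVE) {
            if (pattern.matcher(modelOutput).find()) {
                return SAFE_REPLACEMENT;
            }
        }
        return modelOutput;
    }
}
```

Running this gate after the fallback model, in addition to the input check, matches the directive to preserve both guards: the input guard stops obvious requests, and the output gate catches anything the model produces anyway.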
Scope RAG embeddings by document ownership
Main and develop both added RagLoadJobServiceTest through separate deployment lanes. The merge keeps the async reload behavior checks from main and preserves the user reservation load count added by scoped RAG documents.

Constraint: main and develop diverged through repeated deployment merge commits
Confidence: high
Scope-risk: narrow
Tested: ./gradlew test --tests com.fairing.fairplay.ai.rag.service.RagLoadJobServiceTest --no-daemon
Co-authored-by: OmX <omx@oh-my-codex.dev>

Promotes develop after successful backend CI, with current main merged to resolve the RagLoadJobServiceTest add/add conflict.
Included:
Tested: backend-ci on develop succeeded (run 26559248887).
Tested: ./gradlew test --tests com.fairing.fairplay.ai.rag.service.RagLoadJobServiceTest --no-daemon.